Abstract — We investigate the behavior of a large number of 
selfish users that are able to switch dynamically between multiple 
wireless access-points (possibly belonging to different standards) 
by introducing an iterated non-cooperative game. Users start 
out completely uneducated and naive but, by using a fixed set of 
strategies to process a broadcast training signal, they quickly 
evolve and converge to an evolutionarily stable equilibrium. Then, 
in order to measure efficiency in this steady state, we adapt the 
notion of the price of anarchy to our setting and we obtain an 
explicit analytic estimate for it by using methods from statistical 
physics (namely the theory of replicas). Surprisingly, we find 
that the price of anarchy does not depend on the specifics of the 
wireless nodes (e.g. spectral efficiency) but only on the number of 
strategies per user and a particular combination of the number 
of nodes, the number of users and the size of the training signal. 
Finally, we map this game to the well-studied minority game, 
generalizing its analysis to an arbitrary number of choices. 

Index Terms — Wireless networks, Nash equilibrium, correlated 
equilibrium, price of anarchy, evolutionary game, replicas 



I. Introduction 

As a result of the massive deployment of IEEE 802.11 
wireless networks, and in the presence of large-scale mo- 
bile third-generation systems, mobile users often have several 
choices of overlapping networks to connect to. In fact, devices 
that support multiple standards already exist and, additionally, 
significant progress has been made towards creating flexible 
radio devices capable of connecting to any existing standard 
[1]. It is thus reasonable to expect that, in the near future, users 
will be able to switch dynamically between different networks. 

In such a setting, even though users have several choices to 
connect to, they still have to compete against each other for the 
finite resources of the combined network. Hence, this situation 
can be modelled using non-cooperative game theory, a practice 
that is rapidly becoming one of the main tools in the analysis 
of wireless networks. For example, game-theoretic techniques 
were used to optimize transmission probabilities in [2] and to 
calculate the optimal power allocation [3]-[6] or the optimal 
transmitting carrier in [7]. The authors of [8] and [9] studied 
the possibility of connecting to several access points using a 
single WLAN card; the selfish behavior of service providers 
was analyzed in [10]-[12] and, recently, even the effects of 
pricing were examined in [13]-[16] using game theory. 
The scenario that we consider is an unregulated network 
where a large number of heterogeneous users (e.g. mobile 
devices) connect wirelessly to one of B nodes (perhaps with 
different standards). All users wish to maximize their individual 
downlink throughput but each has a different approach: 
e.g. users may have different tolerance for delay, or may wish 
to employ different "betting" schemes to download data at the 
lowest price. So, in general, users have different strategies, 
fixed at the outset of the game, and unknown to the rest. 

Now, given the users' competition for the nodes' limited 
resources, it is not clear how they can reach an organized state 
in the absence of a central coordinating entity. One possible 
way to overcome this hurdle is if users base their decisions on 
a "training" signal, e.g. a random signal that is synchronously 
broadcast by the nodes and received by all the users. Then, 
as this affair is iterated, one might hope that sophisticated users 
develop an insight into how other users respond to the same 
stimulus and, eventually, learn to coordinate their actions. This 
was precisely the seminal idea behind Aumann's work in [17]: 
players base their decisions on their observations of the "states 
of the world" and reach a correlated equilibrium. 

Similar games have also been studied in econophysics, 
particularly after the introduction of the El Farol problem in 
[18] and the development of the minority game in [19]. In 
both these games, players "buy" or "sell" and are rewarded 
when they land in the minority. Again, the key idea is that 
in order to decide what to do, players record and process the 
game's history with the aid of some predetermined strategies. 
Then, by employing more often the strategies that perform 
better, they quickly converge to an equilibrium which (in an 
unexpected twist) turns out to be oblivious to the source of the 
players' observations [20]. In fact, it was shown in [21] that 
what matters is simply the amount of feedback that players 
receive and the number of strategies they use to process it. 

As in [22], our scenario stands to gain a lot from such 
an approach. Hence, our main goal will be to expound this 
scheme in a way appropriate for selfish users in an unregulated 
wireless network. The first step towards this is to generalize 
and adapt the minority game of [21] to our setting: this is 
done in section II, where we introduce the Simplex Game. 
Next, in section III, we characterize the game's equilibria 
and compare them to the socially optimal state. From this 
comparison emerges the game's price of anarchy, a notion first 
described in [23] and which measures the distance between 
anarchy (equilibria) and efficiency (optimal states). 

Our first important result is obtained in section IV: by iterating 
the game based on the scheme of exponential learning, we 
find that players converge to an evolutionarily stable equilibrium 
(theorem 11). Then, having established convergence, we 
proceed in section V to harvest the game's price of anarchy. 
Quite unexpectedly, we find that the price of anarchy is unaffected 
by disparities in the nodes' characteristics (theorem 14). 
Moreover, we also derive an analytic expression for the price 
of anarchy based on the method of replicas from statistical 
physics. This allows us to study the effect of the various 
parameters on the network's performance, an analysis which 
we supplement with numerical experiments. As a byproduct, 
this generalizes the results of the traditional (binary) minority 
game to an arbitrary number of choices. 

Some calculational details that would detract one's focus 
from the main discussion have been deferred to the appendices 
at the end. Finally, as far as notational conventions go, we 
will denote the standard $(n-1)$-dimensional simplex of $\mathbb{R}^n$ by 
$\Delta_n = \{x \in \mathbb{R}^n : x_j \geq 0 \text{ and } \sum_j x_j = 1\}$; also, we will employ 
the game-theoretic shorthand: $(x_{-i}; y) = (x_1 \ldots y \ldots x_N)$. 

II. The Simplex Game 

To model the scenario that we described in the introduction, 
we consider users that may choose one of $B$ nodes, each 
characterized by a single-user spectral efficiency $c_r$. In this 
case, if $N_r$ users connect to node $r$, their throughput will be: 

$$u_r = \frac{c_r}{N_r} \tag{1}$$

(for simplicity we assume that users have the same transmis- 
sion characteristics). 

Despite the simplicity of this throughput model, it has been 
shown to be of the correct form for TCP and UDP protocols 
in IEEE 802.11b systems, if we limit ourselves to a single 
class of users [9]. Furthermore, in the case of third-generation 
best-effort systems, the realistic total cell-service throughput is 
approximately constant beyond a certain number of connected 
users [24]. Thus, (1) is a reasonable approximation for the 
user throughput of single-class mobiles. 

In fact, equation (1) is flexible enough to account for 
parameters that affect a user's bias towards a node; e.g. we 
can incorporate pricing by modifying $c_r$ to $c_r(1 - p_r)$ where 
$p_r$ reflects the price per bit. So, we may renormalize (1) to: 

$$u_r = y_r \frac{N}{N_r} \tag{2}$$

where $N$ is the total number of users and the coefficients $y_r$ are 
normalized to unity ($\sum_{r=1}^{B} y_r = 1$) 
and represent the "effective strength" of node $r$ in terms of its 
attributes and characteristics. Clearly, nodes can modify this 
"strength" score, in order to maximize their gain; however, 
this is assumed to take place at slower time-scales and, hence, 
these strengths can be assumed to remain constant.^1 

We may now note that the core constituents of a congestion 
game are all present: players (users) are asked to choose 
one of $B$ facilities (nodes), their payoff given by the throughput 
(2). From this standpoint, the "fairest" user distribution is the 
Nash allocation of $y_r N$ users to node $r$: when distributed this 
way, users receive a payoff of $u_0 = 1$ and no one could hope 
to earn more from a unilateral deviation (comparably to the 
"water-filling" of e.g. [25]). As a result, the users' discomfort 
can be gauged by contrasting their payoff to the Nash value: 

$$u_r - u_0 = y_r \frac{N}{N_r} - 1 = 1 - \frac{N_r}{N y_r} + O(1/N) \tag{3}$$

So, if we focus on the leading term of (3) and introduce: 

$$u_r = 1 - \frac{N_r}{N y_r} \tag{4}$$

we may easily see that the Nash equilibria of the game remain 
invariant under this linearization. In other words, the payoffs 
(2) and (4) will be equivalent in terms of social fairness.^2 

^1 Obviously, nodes of zero strength (e.g. negligible spectral efficiency) will 
not appeal to any reasonable user and can be dropped from the analysis. 
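The comparison between the throughput (2) and its linearization (4) can be checked numerically. The following is a minimal sketch, assuming nodes are indexed 0..B-1 and described only by their user counts; the function names `throughput` and `linear_payoff` are illustrative and do not come from the paper.

```python
def throughput(counts, y):
    """Renormalized throughput (2): u_r = y_r * N / N_r.

    counts[r] is the number of users on node r, y[r] its strength.
    Empty nodes have no users to pay, so they are reported as None.
    """
    n = sum(counts)
    return [y_r * n / n_r if n_r > 0 else None for y_r, n_r in zip(y, counts)]


def linear_payoff(counts, y):
    """Linearized payoff (4): u_r = 1 - N_r / (N * y_r)."""
    n = sum(counts)
    return [1.0 - n_r / (n * y_r) for y_r, n_r in zip(y, counts)]


if __name__ == "__main__":
    y = [0.5, 0.3, 0.2]
    nash = [50, 30, 20]      # the Nash allocation y_r * N for N = 100
    moved = [49, 31, 20]     # one user deviates from node 0 to node 1
    print(throughput(nash, y), linear_payoff(nash, y))
    print(throughput(moved, y)[1], linear_payoff(moved, y)[1])
```

At the Nash allocation the throughput equals the Nash value on every node and the linear payoff vanishes; the deviating user ends up worse off under both payoffs, which is the sense in which the two are equivalent.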

Thanks to this linearization, we may express a user's payoff 
in a particularly revealing form. However, to accomplish this, 
we first need to introduce a collection of $B$ vectors in $\mathbb{R}^{B-1}$ 
with which to model the nodes: 

Definition 1: Let $y = (y_1 \ldots y_B) \in \mathrm{Int}(\Delta_B)$ be a strength 
distribution for $B$ nodes.^3 A $y$-simplex (or $y$-appropriate simplex) 
is a collection $\mathcal{B} = \{q_r\}_{r=1}^{B} \subseteq \mathbb{R}^{B-1}$ such that, for $r, l = 1 \ldots B$: 

$$q_r \cdot q_l = -1 + \frac{\delta_{rl}}{\sqrt{y_r y_l}} \tag{5}$$

Admittedly, this definition is rather opaque^4 but, fortunately, 
the geometric picture is much clearer: 

Lemma 2: Let $\mathcal{B} = \{q_r\}_{r=1}^{B}$ be a $y$-appropriate simplex for 
some $y \in \mathrm{Int}(\Delta_B)$. Then: $\sum_r y_r q_r = 0$; also: $\sum_r y_r q_r^2 = B - 1$. 
Proof: To establish the first part, note that: 
$\big(\sum_{r=1}^{B} y_r q_r\big)^2 = \sum_{r,l=1}^{B} y_r y_l \, q_r \cdot q_l = \sum_{r,l=1}^{B} y_r y_l \big(-1 + \delta_{rl}/\sqrt{y_r y_l}\big) = 0$. 
As for the second part, it is just a straightforward application of (5). ■ 
In other words, a $y$-simplex is just like a standard simplex 
with vertices "weighted" by the strengths $y_r$.^5 So, if $N_r$ players 
choose $q_r$, we may consider the aggregate bet $q = \sum_r N_r q_r$ 
and obtain by (5): $q_r \cdot q = \sum_{l=1}^{B} N_l \, q_r \cdot q_l = -N\big(1 - \frac{N_r}{N y_r}\big)$. We 
then get the very useful expression for the payoff (4): 

$$u_i = 1 - \frac{N_{r_i}}{N y_{r_i}} = -\frac{1}{N} q_{r_i} \cdot q = -\frac{1}{N} q_{r_i} \cdot \sum_{j=1}^{N} q_{r_j} \tag{6}$$

where $r_i$ indicates the choice of player $i$. In this way, lemma 
2 shows that Nash equilibria will be characterized by: 

$$q = \sum_{i=1}^{N} q_{r_i} = \sum_{r=1}^{B} y_r N q_r = 0 \tag{7}$$

i.e. the game will be at equilibrium when the players' choices 
balance out the weights $y_r$ at the vertices of the simplex. 
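To make the geometry concrete, here is one possible way to build a $y$-simplex and check property (5) and the first part of lemma 2. This is a sketch under assumptions: it is not the construction of the paper's appendix, but a Householder-reflection construction chosen for this illustration, and the names `y_simplex` and `dot` are hypothetical. It starts from the vectors $w_r = e_r/\sqrt{y_r} - s$ in $\mathbb{R}^B$, with $s = (\sqrt{y_1} \ldots \sqrt{y_B})$, which satisfy (5) and are all orthogonal to $s$, and then rotates $s$ onto the last coordinate axis so that this coordinate can be dropped.

```python
import math


def dot(a, b):
    """Euclidean inner product of two equal-length lists."""
    return sum(x * z for x, z in zip(a, b))


def y_simplex(y):
    """Return B vectors in R^(B-1) with q_r . q_l = -1 + delta_rl / sqrt(y_r y_l).

    Assumes B >= 2 and every y_r > 0 with sum(y) == 1.
    """
    size = len(y)
    if size < 2:
        raise ValueError("need at least two nodes")
    s = [math.sqrt(v) for v in y]      # unit vector, since sum(y) == 1
    v = s[:]                           # Householder vector v = s - e_B
    v[-1] -= 1.0
    vv = dot(v, v)
    vectors = []
    for r in range(size):
        w = [-x for x in s]            # w_r = e_r / sqrt(y_r) - s
        w[r] += 1.0 / s[r]
        coef = 2.0 * dot(v, w) / vv    # reflect: h = w - 2 v (v.w) / |v|^2
        h = [a - coef * b for a, b in zip(w, v)]
        vectors.append(h[:-1])         # last coordinate is w_r . s = 0
    return vectors


if __name__ == "__main__":
    y = [0.5, 0.3, 0.2]
    simplex = y_simplex(y)
    for r, q in enumerate(simplex):
        print(r, q, dot(q, q), 1.0 / y[r] - 1.0)
```

Because the reflection is orthogonal, all inner products of the $w_r$ are preserved, so the truncated vectors inherit (5) up to rounding error.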

Unfortunately, it remains unclear how this Nash allocation 
can be achieved in an unregulated network. For this reason, 
we will introduce a coordination mechanism akin to the one 
proposed by Aumann in his seminal paper [17]. In a nutshell, 
Aumann's scheme is that players observe the random events 
$\gamma$ that transpire in some sample space $\Gamma$ (the "states of the 
world") and then place their bets based on these observations. 
In other words, players' decisions are ordained by their (correlated) 
strategies $f_i$, i.e. functions on $\Gamma$ that convert events 
("states") $\gamma \in \Gamma$ to actions (betting suggestions) $f_i(\gamma)$. 

^2 This is also verified by our numerical experiments (see figure 1). 

^3 To clear up any confusion: $\mathrm{Int}(\Delta_B) = \{y \in \mathbb{R}^B : y_r > 0 \text{ and } \sum_r y_r = 1\}$. 

^4 In fact, it is not even clear that the definition is not vacuous. This is shown 
in appendix A: $y$-simplices are pretty easy to construct for any $y \in \mathrm{Int}(\Delta_B)$. 

^5 One could also ask here why we insist that $y$-simplices be embedded in 
$\mathbb{R}^{B-1}$ instead of $\mathbb{R}^B$. The reason for this is quite subtle and hinges on the fact 
that we need $\mathcal{B}$ to span the space it is embedded in, so that we may apply 
the Hubbard-Stratonovich transformation (see appendix B). 
Inspired by [17] (and also [21]), we propose that a broadcast 
beacon transmit a training signal $m$, drawn from some 
(discrete) sample space $\mathcal{M}$. For example, the nodes could be 
synchronously broadcasting the same integer $m \in \{1 \ldots M\}$, 
drawn from a uniform random sequence that is arbitrated e.g. 
by a government agency such as the FCC in the US. To 
process this signal, user $i$ has at his disposal $S$ $\mathcal{B}$-valued 
random variables $c_{is} : \mathcal{M} \to \mathcal{B}$ ($s = 1 \ldots S$):^6 these are 
the $i$-th user's strategies, used to convert the signal $m$ to an 
action $c_{is}(m) \equiv c_{is}^m \in \mathcal{B}$. So, if user $i$ employs strategy $s_i$, 
the collection of maps $\{c_{i s_i} : \mathcal{M} \to \mathcal{B}\}_{i=1}^{N}$ will be a correlated 
strategy in the sense of [17] (contrast with $\{f_i\}_{i=1}^{N}$ above). 

However, unlike [17], we cannot assume that users develop 
their strategies after careful contemplation on the "states of 
the world". After all, it is quite unlikely that a user will 
have much time to think in the fast-paced realm of wireless 
networks. Consequently, when the game begins, we envision 
that each user "preprograms" $S$ strategies, drawn 
randomly from all the possible $B^M$ maps $\mathcal{M} \to \mathcal{B}$. Of course, 
since we assume users to be heterogeneous, they will program 
their strategies in wildly different ways and independently of 
one another. Still, rational users will exhibit a predisposition 
towards stronger nodes; to account for this, we will posit that: 

$$P(c_{is}^m = q_r) = y_r \tag{8}$$

i.e. the probability that user $i$ programs node $q_r$ as response to 
the signal $m$ is just the node's strength $y_r$. In effect, strategies 
are picked in anticipation of competition with other users: 
specifically, if each user were expecting to play alone, he 
would have picked strategies that lead to the strongest node. 
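The preprogramming rule (8) amounts to filling a three-index table with independent draws. A minimal sketch follows, assuming nodes are represented by their indices 0..B-1 rather than by the simplex vectors $q_r$; the function name `draw_strategy_matrix` and the nested-list layout `c[i][s][m]` are choices made for this illustration only.

```python
import random


def draw_strategy_matrix(n_users, n_strategies, n_signals, y, rng):
    """Draw c[i][s][m], the node index that strategy s of user i
    prescribes for signal m, with P(c = r) = y[r] as in (8).

    rng is a random.Random instance, so that runs can be reproduced.
    """
    nodes = range(len(y))
    return [
        [rng.choices(nodes, weights=y, k=n_signals) for _ in range(n_strategies)]
        for _ in range(n_users)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    c = draw_strategy_matrix(n_users=4, n_strategies=2, n_signals=3,
                             y=[0.5, 0.3, 0.2], rng=rng)
    for i, strategies in enumerate(c):
        print("user", i, strategies)
```

Each inner list is one of the $B^M$ possible maps from signals to nodes, so a user who holds $S$ of them has only sampled a small part of the full strategy space.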

We may now summarize the above in a formal definition: 
Definition 3: Let $y \in \mathrm{Int}(\Delta_B)$ be a strength distribution for 
$B$ nodes. Then, a $y$-appropriate simplex game $\mathfrak{G}$ consists of: 

1) the set of players: $\mathcal{N} = \{1 \ldots N\}$; 

2) the set of nodes: $\mathcal{B} = \{q_r\}_{r=1}^{B}$, where $\mathcal{B}$ is a $y$-simplex; 

3) the set of signals: $\mathcal{M} = \{1 \ldots M\}$, endowed with the 
uniform measure $\varrho_0(m) = \frac{1}{M}$; the ratio $\lambda = \frac{M}{N}$ will be 
called the training parameter of the game; 

4) the set of strategy choices: $\mathcal{S} = \{1 \ldots S\}$; also, for each 
player $i \in \mathcal{N}$, a probability measure $p_i(s) \equiv p_{is}$ on $\mathcal{S}$ 
($\sum_{s=1}^{S} p_{is} = 1$): these are the players' mixed strategies; 

5) a strategy matrix $c : \mathcal{N} \times \mathcal{S} \times \mathcal{M} \to \mathcal{B}$ where $c(i, s, m) \equiv 
c_{is}^m \in \mathcal{B}$ is the node that the $s$-th strategy of user $i$ indicates 
as response to the signal $m \in \mathcal{M}$; the entries of $c$ are 
drawn randomly based on: $P(c_{is}^m = q_r) = y_r$. 

Moreover, we endow $\Omega = \mathcal{M} \times \mathcal{S}^N$ with the product 
measure $\varrho_0 \times \prod_{i=1}^{N} p_i$ and define the following: 

6) an instance of $\mathfrak{G}$ is an event $\omega = (m, s_1, \ldots s_N)$ of $\Omega$; 

7) the bet of player $i$ is the $\mathcal{B}$-valued random variable: 
$b_i(\omega) = c(i, s_i, m)$; also, the aggregate bet is: $b = \sum_{i=1}^{N} b_i$; 

8) the payoff for player $i$ is the r.v.: $u_i = -\frac{1}{N} b_i \cdot b$. 

Thus, similarly to the minority game of [19] and [21], the 
sequence of events that we intuitively envision is:^7 

^6 We are assuming that $S$ is the same for all users for the sake of simplicity. 
^7 It is important to note here that, for 2 identical nodes ($\mathcal{B} = \{-1, 1\}$), the 
simplex game reduces exactly to the original minority game of [21]. 



• in the "initialization" phase (steps 1-5), players program 
  their strategies by drawing the strategy matrix $c$; 
• in step 6, the signal $m$ is broadcast and, based on $p_i$, 
  players pick a strategy $s \in \mathcal{S}$ to process it with: $p_{is}$ is 
  the probability that user $i$ employs his $s$-th strategy; 
• in steps 7-8, players connect to the nodes that their strategies 
  indicate ($b_i(m, s_1 \ldots s_N) = c_{i s_i}^m$) and receive the linear 
  payoff (4): by eq. (6), each of the $N_r$ users that end up 
  connecting to node $q_r$ receives: $-\frac{1}{N} q_r \cdot \sum_l N_l q_l = 1 - \frac{N_r}{N y_r}$; 
• the game is iterated by repeating steps 6-8. 
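Steps 6-8 above can be sketched as a single function. The snippet assumes that the strategy matrix is stored as nested lists `c[i][s][m]` of node indices and that mixed strategies are given as lists of probabilities; the name `play_round` and this data layout are hypothetical, not taken from the paper.

```python
import random


def play_round(c, p, y, rng):
    """Play one instance of the game (steps 6-8).

    c[i][s][m] -- node index prescribed by strategy s of user i for signal m
    p[i][s]    -- probability that user i employs strategy s
    y[r]       -- strength of node r
    Returns (signal, chosen strategies, chosen nodes, linear payoffs).
    """
    n_users = len(c)
    n_signals = len(c[0][0])
    m = rng.randrange(n_signals)                       # step 6: broadcast signal
    picks = [rng.choices(range(len(p_i)), weights=p_i)[0] for p_i in p]
    nodes = [c[i][picks[i]][m] for i in range(n_users)]  # step 7: place the bets
    counts = [0] * len(y)
    for r in nodes:
        counts[r] += 1
    # step 8: linear payoff 1 - N_r / (N y_r) for a user on node r
    payoffs = [1.0 - counts[r] / (n_users * y[r]) for r in nodes]
    return m, picks, nodes, payoffs


if __name__ == "__main__":
    c = [[[0], [1]], [[0], [1]]]        # two users, two strategies, one signal
    p = [[1.0, 0.0], [0.0, 1.0]]        # user 0 plays strategy 0, user 1 strategy 1
    print(play_round(c, p, [0.5, 0.5], random.Random(0)))
```

In the small example the two users split evenly over two identical nodes, which is the Nash allocation, so both linear payoffs vanish.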
As usual, the payoff that corresponds to the (mixed) strategy 
profile $p = (p_1 \ldots p_N)$ will be the multilinear extension: 
$u_i(m, p) = \sum_{s_1 \ldots s_N} p_{1 s_1} \cdots p_{N s_N} \, u_i(m, s_1 \ldots s_N)$. To avoid carrying 
cumbersome sums like this, we will follow the notation of [21] 
and use $\langle \cdot \rangle$ to indicate expectations over a particular player's 
mixed strategy: $\langle c_i \rangle = \sum_s p_{is} c_{is}$; also, we will use an overline to 
denote averaging over the training signals, as in: $\overline{a} = \frac{1}{M} \sum_m a^m$. 

III. Selfishness and Efficiency 

Clearly, the only way that selfish users who seek to max- 
imize their individual throughput can come to an unmedi- 
ated understanding is by reaching an equilibrial state that 
discourages unilateral deviation. But, since there is a palpable 
difference between the users' strategic decisions ($s \in \mathcal{S}$) and 
the tactical actions they take based on them ($c_{is}^m \in \mathcal{B}$), one 
would naturally expect the situation to be somewhat involved. 

A. Notions of Equilibrium 

Indeed, it should not come as a surprise that this dichotomy 
between strategies and actions is reflected in the game's 
equilibria. On the one hand, we have already encountered the 
game's tactical equilibrium: it corresponds to the Nash allocation 
of $y_r N$ users to node $r$. On the other hand, given that users 
only control their strategic choices, we should also examine 
Aumann's strategic notion of a correlated equilibrium. 

To that end, recall that a correlated strategy is a collection 
$f = \{f_i\}_{i=1}^{N}$ of maps $f_i : \mathcal{M} \to \mathcal{B}$ (one for each player) that 
convert the signal $m$ to a betting suggestion $f_i(m) \in \mathcal{B}$. We will 
then say that a (pure) correlated strategy $f$ is at equilibrium for 
player $i$ when, for all perturbations $(f_{-i}; g_i) = (f_1 \ldots g_i \ldots f_N)$ 
of $f$, player $i$ gains more (on average) by sticking to $f_i$, i.e. 
$u_i(f) \geq u_i(f_{-i}; g_i)$. When this is true for all players $i \in \mathcal{N}$, $f$ 
will be called a correlated equilibrium. 

As we saw before, if user $i$ picks his $s_i$-th strategy, the collection 
$\{c_{i s_i} : \mathcal{M} \to \mathcal{B}\}_{i=1}^{N}$ is a correlated strategy, but the converse 
need not hold: in general, not every correlated strategy can be 
recovered from the limited number of preprogrammed strategic 
choices.^8 Thus, users will no longer be able to consider all 
perturbations of a given strategy, and we are led to: 

Definition 4: In the setting of definition 3, a strategy profile 
$p = (p_1 \ldots p_N)$ is a constrained correlated equilibrium when, 
for all strategy choices $s \in \mathcal{S}$ and for all players $i \in \mathcal{N}$: 

$$\frac{1}{M} \sum_m u_i(m, p) \geq \frac{1}{M} \sum_m u_i(m; p_{-i}, s). \tag{9}$$

The set of all such equilibria of $\mathfrak{G}$ will be denoted by $\Delta^*(\mathfrak{G})$. 

^8 There is a total of $B^{MN}$ correlated strategies but users can recover at most 
$S^N$ of them. In fact, this is why preprogramming is so useful: it would be 
highly unreasonable to expect a given user to process in a timely fashion the 
exponentially growing number of $B^M$ (as compared to $S$) strategies. 

In our setting, a (constrained) correlated equilibrium is 
what represents anarchy: with no one to manage the users' 
selfish desires, the only thing that deters them from unilateral 
deviation is their expectation of (average) loss. Conceptually, 
this is pretty similar to the notion of a Nash equilibrium, the 
main difference being that in a correlated equilibrium we are 
averaging the payoff over the training signals. This analogy 
will be very useful to us and we will make it precise by 
introducing the correlated form of the simplex game: 

Definition 5: The correlated form of a simplex game $\mathfrak{G}$ is 
a game $\mathfrak{G}^*$ with the same set of players $\mathcal{N} = \{1 \ldots N\}$, each 
one choosing an action from $\mathcal{S} = \{1 \ldots S\}$ for a payoff of: 

$$u_i^*(s_1 \ldots s_N) = \frac{1}{M} \sum_m u_i(m, s_1 \ldots s_N) \tag{10}$$

In short, the payoff that players receive in the correlated game 
is their throughput averaged over a rotation of the training 
signals. Then, an important consequence of definition 4 is that 
the constrained correlated equilibria of a simplex game $\mathfrak{G}$ are 
precisely the Nash equilibria of its correlated form $\mathfrak{G}^*$. 
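For small games, the signal-averaged payoff (10) and the equilibrium condition (9) can be checked by brute force. The sketch below restricts attention to pure profiles (one strategy index per player) and again assumes a nested-list strategy matrix `c[i][s][m]` of node indices; the helper names are illustrative only.

```python
def linear_payoffs(nodes, y):
    """Linear payoff 1 - N_r / (N y_r) for every player, given chosen nodes."""
    n_users = len(nodes)
    counts = [0] * len(y)
    for r in nodes:
        counts[r] += 1
    return [1.0 - counts[r] / (n_users * y[r]) for r in nodes]


def correlated_payoffs(c, profile, y):
    """Payoff (10): each player's payoff averaged over all M signals."""
    n_users = len(profile)
    n_signals = len(c[0][0])
    totals = [0.0] * n_users
    for m in range(n_signals):
        nodes = [c[i][profile[i]][m] for i in range(n_users)]
        for i, u in enumerate(linear_payoffs(nodes, y)):
            totals[i] += u
    return [t / n_signals for t in totals]


def is_pure_equilibrium(c, profile, y, tol=1e-12):
    """Condition (9) for a pure profile: no player gains, on average over
    the signals, by switching to another preprogrammed strategy."""
    base = correlated_payoffs(c, profile, y)
    for i in range(len(profile)):
        for s in range(len(c[i])):
            trial = list(profile)
            trial[i] = s
            if correlated_payoffs(c, trial, y)[i] > base[i] + tol:
                return False
    return True


if __name__ == "__main__":
    c = [[[0], [1]], [[0], [1]]]     # two users, two strategies, one signal
    y = [0.5, 0.5]
    print(is_pure_equilibrium(c, [0, 1], y), is_pure_equilibrium(c, [0, 0], y))
```

The check only ranges over the $S$ preprogrammed alternatives of each player, which is exactly what distinguishes the constrained notion from Aumann's unconstrained one.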

B. Harvesting the Equilibria 

So, our next goal will be to understand the Nash equilibria 
of $\mathfrak{G}^*$. To begin with, a brief calculation shows that the payoff 
$u_i^*$ for a mixed profile $p = (p_1 \ldots p_N)$ is: 

$$u_i^*(p_1 \ldots p_N) = -\frac{1}{N} \Big[ \overline{\langle c_i \rangle \cdot \sum_{j \neq i} \langle c_j \rangle} + \overline{\langle c_i^2 \rangle} \Big] \tag{11}$$

(the averaging notations $\langle \cdot \rangle$ and $\overline{\,\cdot\,}$ being as in the end of 
section II). Thus, given the similarities of our game with 
congestion games, it might be hoped that its Nash equilibria 
can be harvested by means of a potential function, i.e. a 
function that measures the payoff difference between users' 
individual strategies [26]. More concretely, a potential $U$ 
satisfies: $u_i^*(p_{-i}; s_1) - u_i^*(p_{-i}; s_2) = U(p_{-i}; s_1) - U(p_{-i}; s_2)$ for 
any mixed profile $p = (p_1 \ldots p_N)$ and any two strategic choices 
$s_1, s_2$ of player $i$. Obviously then, if a potential function exists, 
its local maxima will be Nash equilibria of the game. 

But, unfortunately, since $\mathfrak{G}^*$ does not have an exact congestion 
structure, it is not clear how to construct such a potential. 
Nevertheless, a good candidate is the game's aggregate payoff 
$u^* = \sum_i u_i^*$. In fact, if player $i$ chooses strategy $s$, $u_i^*$ becomes: 

$$u_i^*(p_{-i}; s) = -\frac{1}{N} \Big[ \overline{c_{is} \cdot \sum_{j \neq i} \langle c_j \rangle} + \overline{c_{is}^2} \Big]$$

So, after some similar algebra for the aggregate $u^*(p_{-i}; s)$, we obtain 
the following comparison between two strategies $s_1, s_2 \in \mathcal{S}$: 

$$u^*(p_{-i}; s_2) - u^*(p_{-i}; s_1) = 2 \big[ u_i^*(p_{-i}; s_2) - u_i^*(p_{-i}; s_1) \big] + \frac{1}{N} \Big( \overline{c_{i s_2}^2} - \overline{c_{i s_1}^2} \Big) \tag{12}$$

Now, given the preprogramming (8) of $c$, we note that $(c_{is}^m)^2$ 
takes on the value $q_r^2 = -1 + \frac{1}{y_r}$ with probability $y_r$. Hence, 
the central limit theorem (recall that $M = \lambda N = O(N)$) implies 
that $\overline{c_{is}^2} = \frac{1}{M} \sum_{m=1}^{M} (c_{is}^m)^2$ will have mean $\sum_r y_r \big( \frac{1}{y_r} - 1 \big) = B - 1$ and 
variance $\frac{1}{M} \sum_{r=1}^{B} \big( \frac{1}{y_r} - B \big)$, the latter being negligible unless $y$ is 
too close to the faces of $\Delta_B$. More concretely: 

Definition 6: A distribution $y \in \mathrm{Int}(\Delta_B)$ is proper when 
$\frac{1}{B^2} \sum_{r=1}^{B} \big( \frac{1}{y_r} - B \big) = O(1)$; otherwise, $y$ is called degenerate. 

Henceforward, our working assumption will be that there 
are no degenerate nodes: otherwise, we could simply remove 
them from the analysis (i.e. reduce $B$ and modify $y$ accordingly). 
This reflects the fact that degeneracy in the strength 
distribution simply indicates that certain nodes have extremely 
low strength scores and all reasonable users shun them.^9 

With this in mind, the last term of (12) will be on average 0 
and with a variance of lesser order than the first term. Thus: 

$$u^*(p_{-i}; s_2) - u^*(p_{-i}; s_1) \sim 2 \big[ u_i^*(p_{-i}; s_2) - u_i^*(p_{-i}; s_1) \big] \tag{13}$$

i.e. the aggregate payoff $u^*$ is indeed a potential function for 
the game $\mathfrak{G}^*$ (at least asymptotically). We have thus proven: 
Lemma 7: Let $\mathfrak{G}$ be a simplex game for $N$ players. Then, 
as $N \to \infty$, the maxima of the averaged aggregate $u^* = \sum_{i=1}^{N} u_i^*$ 
will correspond (almost surely) to correlated equilibria of $\mathfrak{G}$. 

C. Anarchy and Efficiency 

Still, one expects quite the gulf between anarchic and 
efficient states: after all, selfish players are hardly the ones 
to rely upon for social efficiency. In the context of networks, 
this contrast is frequently measured by the price of anarchy, 
a notion first introduced in [23] as the (coordination) ratio 
between the maximum attainable aggregate payoff and the one 
attained at the game's equilibria. Then, depending on whether 
we look at worst or best-case equilibria, we get the pessimistic 
or optimistic price of anarchy respectively. 

In our game, the aggregate payoff is equal to: 
$u = \sum_i u_i = -\frac{1}{N} \sum_{i,j=1}^{N} b_i \cdot b_j = -\frac{1}{N} b^2$ and attains a maximum of $u_{\max} = 0$ 
when $b = 0$. So, if we recall by (7) that a Nash equilibrium 
occurs if and only if $b = 0$, we see that Nash anarchy 
does not impair efficiency. Clearly, neither the users, nor the 
agencies that deploy the wireless network could hope for a 
better solution! 

However, this also shows that the traditional definition of 
the price of anarchy is no longer suitable for our purposes. 
One reason is that $u_{\max} = 0$ and, hence, we cannot hope to get 
any information from ratios involving $u_{\max}$.^10 What's more, 
the users' selfishness in our setting is more aptly captured by 
the Aumann equilibria of definition 4, so we should be taking 
the signal-averaged $u^*$ instead of $u$. As a result, we are led to: 

Definition 8: Let $\mathfrak{G}$ be a simplex game for $N$ players and 
$B$ nodes. Then, if $p = (p_1 \ldots p_N)$ is a mixed strategy profile 
of $\mathfrak{G}$, we define its frustration level to be: 

$$R(p) = -\frac{1}{B-1} u^*(p) = \frac{1}{N(B-1)} \, \frac{1}{M} \sum_m b^2(m, p) \tag{14}$$

that is, the (average) distance from the Nash solution $b = 0$. 
Also, the game's correlated price of anarchy $R(\mathfrak{G})$ will be: 

$$R(\mathfrak{G}) = \inf \{ R(p) : p \in \Delta^*(\mathfrak{G}) \} \tag{15}$$

i.e. the minimum value of the frustration level over the set 
$\Delta^*(\mathfrak{G})$ of the game's constrained correlated equilibria. 

Some remarks are now in order: first and foremost, we 
see that the frustration level of a strategy profile measures 

'After all, degenerate nodes cannot serve more than o{ ^fff) users. 
(in)efficiency by contrasting the average aggregate payoff 
to the optimal case $u_{\max} = 0$ (the normalization $\frac{1}{B-1}$ has 
been introduced for future convenience). So, with correlated 
equilibria representing the anarchic states of the game, we 
remain justified in the eyes of [23] by calling $R(\mathfrak{G})$ the price 
of anarchy. In effect, the only thing that sets us apart is that, 
instead of a ratio, we are taking the difference. 

^10 This actually highlights a general problem with the coordination ratio: it 
does not behave well w.r.t. adding a constant to the payoff functions. 
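As a sanity check on this normalization, one can work out the frustration of users who do not learn at all. The derivation below is a sketch under one assumption: each user independently connects to node $q_r$ with probability $y_r$, as in the preprogramming rule. It uses only the two identities of lemma 2.

```latex
\begin{align*}
\mathbb{E}[b_i] &= \sum_{r} y_r q_r = 0,
\qquad
\mathbb{E}[b_i^2] = \sum_{r} y_r q_r^2 = B - 1, \\
\mathbb{E}[b^2] &= \sum_{i} \mathbb{E}[b_i^2]
  + \sum_{i \neq j} \mathbb{E}[b_i] \cdot \mathbb{E}[b_j] = N(B - 1), \\
R_{\mathrm{random}} &= \frac{\mathbb{E}[b^2]}{N(B-1)} = 1 .
\end{align*}
```

So the factor $\frac{1}{B-1}$ sets the frustration of such unsophisticated users to 1 on average, and any equilibrium with $R < 1$ improves on random node selection.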

Finally, one might wonder why we do not consider the 
pessimistic version by replacing the inf of the above definition 
with a sup. The main reason for this is that in the next section, 
we will present a scheme with which users will be able to 
converge to their most efficient equilibrium. Thus, there is no 
reason to consider worst-case equilibria as in [23]: we only 
need to measure the price of sophisticated anarchy. 

IV. Evolution and Equilibria 

Naturally, as the simplex game is iterated, one may assume 
that rational users will want to maximize their payoff by 
employing more often the strategies that perform better. The 
most obvious way to accomplish this is to keep track of a 
strategy's performance and reward it accordingly: 

Definition 9: Let $\mathfrak{G}$ be a simplex game as in definition 3 
and let $\omega = (m, s_1 \ldots s_N)$ be an instance of $\mathfrak{G}$. Then, the reward 
to the $s$-th strategy of player $i$ is the random variable: 

$$W_{is}(\omega) = \frac{1}{M} u_i(m; s_{-i}, s) = -\frac{1}{MN} \, c_{is}^m \cdot \big[ b(\omega) + (c_{is}^m - c_{i s_i}^m) \big] \tag{16}$$

In other words, the reward $W_{is}$ that player $i$ awards to his $s$-th 
strategy is (a fraction of) the payoff that the strategy would 
have garnered for the player in the given instance.^11 

A seeming problem with the above definition is that, in 
order to learn and evolve, users will have to rate all their 
strategies, i.e. they must be able to calculate the payoff even 
of strategies they did not employ. So, given that the payoff is a 
function of the aggregate bet b, it would seem that users would 
have to be informed of every other user's bet, a prospect that 
downright shatters the unregulated premises of our setting. 
However, a more careful consideration of (6) reveals that it 
suffices for users to know the distribution of users among the 
nodes, something which is small enough to be broadcast by 
the nodes along with the signal $m$.^12 

So, let us consider a sequence $\omega(t)$ of instances of $\mathfrak{G}$ to 
model the game's $t$-th iteration ($t = 0, 1, 2 \ldots$). At time $t + 1$, 
players rank their strategies according to their scores: 

$$U_{is}(t+1) = U_{is}(t) + W_{is}(\omega(t)) \tag{17}$$

where we set $U_{is}(0) = 0$ to reflect that there is no a priori 
predisposition towards any given strategy. Then, strategies are 
selected according to their scores, following the evolutionary 
scheme of exponential learning (see e.g. [18], [27]): 

$$p_{is}(t) = \frac{e^{\Gamma_i U_{is}(t)}}{\sum_{s'} e^{\Gamma_i U_{is'}(t)}} \tag{18}$$

where $\Gamma_i$ represents the learning rate of player $i$. 
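The score update (17) and the selection rule (18) translate almost directly into code. The sketch below handles a single player; the function names are hypothetical, and subtracting the largest score before exponentiating is a standard numerical safeguard added here, not something the paper discusses (it leaves the probabilities of (18) unchanged).

```python
import math


def update_scores(scores, rewards):
    """Score update (17): add the latest reward to each strategy's score."""
    return [u + w for u, w in zip(scores, rewards)]


def exponential_learning(scores, rate):
    """Selection rule (18): p_s proportional to exp(rate * U_s).

    The maximum score is subtracted first so that large scores or
    learning rates cannot overflow; the common factor cancels out.
    """
    top = max(scores)
    weights = [math.exp(rate * (u - top)) for u in scores]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    scores = [0.0, 0.0]                        # no a priori predisposition
    print(exponential_learning(scores, rate=20.0))
    scores = update_scores(scores, [0.05, -0.02])
    print(exponential_learning(scores, rate=20.0))
```

With all scores at zero every strategy is equally likely; as the learning rate grows, the rule approaches a "hard" best response that always picks the top-scoring strategy.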

^11 The rescaling factor $\frac{1}{M}$ has been introduced because significant rewards 
should come only after checking a strategy against at least $O(M)$ signals. 

^12 Actually, the signal itself could be the user distribution of the previous 
stage. This was discussed in [20] where the distinction between real and fake 
memory is seen to have a negligible impact on the game's performance. 



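As a first step to understand the dynamical system of (18), 
we note that players' evolution actually takes place over the 
time scale $\tau = t/M$: it takes an average of $O(M)$ iterations 
to notice a distinct change in the scores of (16). In this case, 
the score of a strategy will have been modified by: 
$\delta U_{is} = \sum_{t'=t}^{t+M-1} W_{is}(\omega(t')) = -\frac{1}{MN} \sum_{t'} \big( c_{is}^{m(t')} \cdot \sum_{j \neq i} c_{j s_j(t')}^{m(t')} + (c_{is}^{m(t')})^2 \big)$. But, 
by applying the central limit theorem, we may write 
$\sum_{j \neq i} c_{j s_j}^{m} \sim \sum_{j \neq i} \langle c_j^m \rangle$ and, under some mild ergodicity 
assumptions, we can also approximate the time average 
$\frac{1}{M} \sum_{t'} (\cdot)$ by the ensemble average $\frac{1}{M} \sum_m (\cdot)$. Thus, the change 
in a strategy's score after $M$ iterations will be: 

$$\delta U_{is} \sim -\frac{1}{N} \Big[ \overline{c_{is} \cdot \sum_{j \neq i} \langle c_j \rangle} + \overline{c_{is}^2} \Big] = u_i^*(p_{-i}; s) \tag{19}$$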



[Fig. 1 plot, titled "Convergence to the Steady State". Curves: Simplex Game (linearised), Random Case, Simplex Game (capacity), Action Game.] 

Fig. 1. Simulation of a simplex game for $N = 50$ players that seek to connect 
to $B = 5$ nodes of random strengths with the help of $M = 2$ broadcasts and 
$S = 2$ strategies. The game is iterated based on (18) with a learning rate of 
$\Gamma_i = 20$, and we plot the users' (instantaneous) frustration $R_t = \frac{1}{N(B-1)} b^2(t)$ (cf. (14)) 
versus the number of iterations $t$: as predicted by theorem 11, players quickly 
converge to a steady state of minimal frustration. To justify the linearisation of 
(3), we also simulated a game with the nonlinear payoff (1), obtaining virtually 
indistinguishable results. As a baseline, we consider unsophisticated users who 
simply pick a node randomly, thus experiencing much higher frustration (on 
average $R = 1$). Finally, we also simulate replicator dynamics (with the same 
learning rate) on the congestion game determined by (1); in that case, although 
users eventually reach the Nash solution, they do so at a much slower rate. 

A fine point in the above is the implicit assumption that $p_{is}$ 
changes very slowly. This caveat collapses if the learning rates 
$\Gamma_i$ are too high (i.e. when we approach "hard" best-response 
schemes^13) but, if we stay away from this limit, we may pass 
to continuous time and differentiate (18) to obtain: 

$$\frac{dp_{is}}{d\tau} = \Gamma_i \, p_{is} \big( u_i^*(p_{-i}; s) - u_i^*(p) \big) \tag{20}$$

since, by (19), $\frac{dU_{is}}{d\tau}$ will be given by $u_i^*(p_{-i}; s)$. 
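The passage from the exponential learning rule (18) to the continuous-time dynamics (20) can be spelled out in two lines. This is a sketch that takes the score drift $dU_{is}/d\tau = u_i^*(p_{-i}; s)$ of (19) for granted and differentiates the logarithm of (18):

```latex
\begin{align*}
\frac{dp_{is}}{d\tau}
  &= \Gamma_i \, p_{is} \Big( \frac{dU_{is}}{d\tau}
     - \sum_{s'} p_{is'} \frac{dU_{is'}}{d\tau} \Big) \\
  &= \Gamma_i \, p_{is} \Big( u_i^*(p_{-i}; s)
     - \sum_{s'} p_{is'} \, u_i^*(p_{-i}; s') \Big)
   = \Gamma_i \, p_{is} \big( u_i^*(p_{-i}; s) - u_i^*(p) \big).
\end{align*}
```

The last equality uses the multilinearity of the payoff, $u_i^*(p) = \sum_{s'} p_{is'} \, u_i^*(p_{-i}; s')$: strategies that do better than the player's current average gain weight, and the rest lose it.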

^13 See [28] for a detailed discussion on this. 
^14 These are attracting steady states that are also Lyapunov stable. 

These dynamics are extremely powerful: they are the standard 
multi-population replicator dynamics for the correlated 
form $\mathfrak{G}^*$ of the game. To be sure, in Weibull's extremely 
comprehensive account [29], it is shown that they exhibit a 
striking equivalence: the asymptotically stable states^14 of (20) 
are precisely the (strict) Nash equilibria of the underlying 
game (in our case $\mathfrak{G}^*$). But, since a strategy profile is a Nash 
equilibrium for the correlated game $\mathfrak{G}^*$ if and only if it is a 
correlated equilibrium for the original game $\mathfrak{G}$, this proves: 

Lemma 10: Let $\mathfrak{G}$ be a simplex game, iterated under (18). 
Then, almost surely as $N \to \infty$, a profile $p = (p_1 \ldots p_N)$ will 
be asymptotically stable w.r.t. the dynamics of (18) if and only 
if it is a constrained correlated equilibrium of $\mathfrak{G}$. 

So, what remains to be seen is whether the learning scheme 
of (18) really does lead the game to such a fortuitous state. 
To that end, one would expect that, as users evolve, they learn 
how to minimize their average frustration level and eventually 
settle down to a stable local minimum. Roughly speaking, this 
is the content of a Lyapunov function, i.e. a function $L = L(p)$ 
with $\frac{dL}{d\tau} \leq 0$. If such a function exists, Lyapunov's theorem 
will ensure convergence to the steady state and, thankfully, 
there is an obvious candidate: the aggregate payoff $u^*$ which 
is also the potential of the correlated game $\mathfrak{G}^*$. 

Indeed, if we combine (13) and (20), we can see that:

$$\frac{du^*}{dt} = \sum_i \sum_s \frac{\partial u^*}{\partial p_{is}} \frac{dp_{is}}{dt} = \sum_i \sum_s u^*(p_{-i}; s)\, p_{is} \left( u^*(p_{-i}; s) - u^*(p) \right) \geq 0,$$

the last step owing to Jensen's inequality (recall that $u^*(p) = \sum_s p_{is} u^*(p_{-i}; s)$). In other words, the frustration $R$, being proportional to $-u^*$, is a Lyapunov function for the dynamics of (20) and the players will converge to its global minimum; in effect, this proves:
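The Jensen step above can be checked numerically for a single user. The sketch below is only illustrative: the function names and the sample numbers are hypothetical, and `u[s]` stands for the payoff $u^*(p_{-i}; s)$ that the user would obtain by playing strategy $s$. It shows that the user's contribution to $du^*/dt$ coincides with the variance of the payoffs under $p$, which is why it cannot be negative.

```python
def potential_growth_rate(p, u):
    """One user's contribution to du*/dt: sum_s u_s * p_s * (u_s - u_bar).

    p[s] is the probability of strategy s and u[s] plays the role of
    u*(p_{-i}; s); both lists are illustrative inputs, not data from the paper.
    """
    mean_u = sum(ps * us for ps, us in zip(p, u))  # u*(p) = sum_s p_s u_s
    return sum(us * ps * (us - mean_u) for ps, us in zip(p, u))


def payoff_variance(p, u):
    """Variance of the payoffs u[s] under the mixed strategy p."""
    mean_u = sum(ps * us for ps, us in zip(p, u))
    return sum(ps * (us - mean_u) ** 2 for ps, us in zip(p, u))
```

The two quantities agree for any mixed strategy, and both vanish at a pure strategy, in line with the steady states being vertices.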

Theorem 11: If a simplex game $\mathfrak{G}$ with a large number of players is iterated under the exponential learning scheme, the players' mixed strategies will converge almost surely to an asymptotically stable state $p^*$ with the following properties:

(i) $p^*$ is a (strict) constrained correlated equilibrium of $\mathfrak{G}$;

(ii) $p^*$ is the most efficient equilibrium of $\mathfrak{G}$, in the sense that it maximizes the aggregate payoff $u^*$ over all $p \in \prod_{i=1}^{N} \Delta_S$;

(iii) p* is pure. 

Proof: Thanks to the preceding discussion and Lyapunov's second theorem, we only need to prove part (iii). But, since $u^*$ is harmonic in $p$, it will attain its maximum value on one of the vertices of $\Delta = \prod_{i=1}^{N} \Delta_S$. Then, seeing as $p^*$ maximizes $u^*$ by part (ii), it must be pure. ■

V. The Price of Anarchy 

So far, we have seen that the dynamics of exponential 
learning lead the users to an evolutionarily stable equilibrium 
which maximizes (on average) their aggregate payoff (given 
their preprogramming). Hence, as far as measuring anarchy is 
concerned, we only need to calculate the level of frustration 
at this steady state: rather surprisingly, it will turn out that 
the price of anarchy is independent of the distribution $y$ of
the nodes' strengths. In fact, the analytic expression that we
obtain at the end of this section shows that it is a function
only of the number $B$ of nodes in the network, the training
parameter $\lambda = M/N$ and the number $S$ of strategies per user.

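To begin with, the expression for the frustration level $R$ at a mixed strategy profile $p$ can be rewritten as:

$$R(p) = \frac{1}{N(B-1)} \left[ \sum_i \langle c_i^2 \rangle + \sum_{i \neq j} \langle c_i \rangle \cdot \langle c_j \rangle \right] \tag{21}$$

So, recalling definition 6 and the discussion for the aggregate payoff (12), the first term of (21) will be: $\frac{1}{N(B-1)} \sum_i \langle c_i^2 \rangle \sim 1$.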



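Then, to deal with the second term in (21), note that for a given $m$, the aggregate bet $b(m, p) = \sum_i \langle c_i^m \rangle$ gives: $b(m,p)^2 = \sum_i \sum_s p_{is}^2 (c_{is}^m)^2 + \sum_i \sum_{s \neq s'} p_{is} p_{is'}\, c_{is}^m \cdot c_{is'}^m + \sum_{i \neq j} \langle c_i^m \rangle \cdot \langle c_j^m \rangle$. Thus, to leading order in $N$, this expression has an average of:

$$\frac{1}{M} \sum_m b(m, p)^2 \sim \sum_{i \neq j} \langle c_i \rangle \cdot \langle c_j \rangle + (B-1) \sum_{i,s} p_{is}^2 \tag{22}$$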
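As a result, equations (21) and (22) may be combined to:

$$R(p) = 1 + \frac{1}{N(B-1)} \frac{1}{M} \sum_m b(m, p)^2 - G(p) \tag{23}$$

where $G(p) = \frac{1}{N} \sum_i \sum_s p_{is}^2$.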
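For a pure profile and a single signal, the squared aggregate bet depends only on how many users sit on each node, because the simplex vectors satisfy $q_r \cdot q_l = -1 + \delta_{rl}/y_r$. The sketch below rests on one assumption that should be read as ours: that the frustration of such a profile is $b^2 / (N(B-1))$, which is how we read the expressions around (21)-(23) when $G = 1$. The function names are hypothetical.

```python
def frustration(counts, y):
    """Frustration of a pure profile for one signal (assumed to be b^2 / (N (B - 1))).

    counts[r] is the number of users on node r and y[r] the node's strength;
    b^2 = sum_{r,l} N_r N_l q_r.q_l = -N^2 + sum_r N_r^2 / y_r.
    """
    n_users = sum(counts)
    n_nodes = len(y)
    b_squared = -n_users ** 2 + sum(n * n / yr for n, yr in zip(counts, y))
    return b_squared / (n_users * (n_nodes - 1))


def expected_random_frustration(n_users, y):
    """Exact mean frustration when each user independently picks node r w.p. y[r].

    Uses the multinomial second moment E[N_r^2] = N y_r (1 - y_r) + (N y_r)^2.
    """
    n_nodes = len(y)
    mean_sq = [n_users * yr * (1 - yr) + (n_users * yr) ** 2 for yr in y]
    b_squared = -n_users ** 2 + sum(m / yr for m, yr in zip(mean_sq, y))
    return b_squared / (n_users * (n_nodes - 1))
```

Under these assumptions a perfectly proportional split ($N_r = N y_r$) has zero frustration, while uninformed random users sit at the baseline value of 1 for any strength distribution.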

By definition 8, the game's (optimistic) price of anarchy $R(\mathfrak{G})$ will simply be the minimum of $R(p)$ over the game's equilibria. But, since the minimum of $R$ is an equilibrium by theorem 11, we can simply take the minimum over all mixed profiles: $R(\mathfrak{G}) = \min\{R(p) : p \in \prod_{i=1}^{N} \Delta_S\}$. In this way, we get a minimization problem of the kind commonly encountered in statistical physics where one seeks to harvest the ground states of (similar in form) energy functionals [30].

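Motivated by this, we introduce the partition function:

$$Z(\beta, c) = \int_{\Delta} e^{-\beta N R(p)}\, dp \tag{24}$$

where $\Delta = \prod_i \Delta_S$ and $dp = \prod_{i,s} dp_{is}$ is Lebesgue measure on $\Delta$. In this way, we may integrate asymptotically to write:

$$R(\mathfrak{G}) = -\frac{1}{N} \lim_{\beta \to \infty} \frac{1}{\beta} \log Z(\beta, c). \tag{25}$$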
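The asymptotic integration behind (25) can be illustrated in one dimension: for large $\beta$, the quantity $-\frac{1}{\beta} \log \int e^{-\beta f}$ approaches the minimum of $f$. The following sketch uses a hypothetical helper name, a toy function on $[0, 1]$ and a plain midpoint rule; it is not the high-dimensional integral of the paper.

```python
import math


def soft_min(f, beta, a=0.0, b=1.0, steps=20000):
    """Approximate -(1/beta) * log( integral_a^b exp(-beta * f(p)) dp ).

    The integral is evaluated with a midpoint rule; the largest exponent is
    factored out (log-sum-exp) so that large beta does not underflow.
    """
    h = (b - a) / steps
    exponents = [-beta * f(a + (k + 0.5) * h) for k in range(steps)]
    top = max(exponents)
    log_integral = top + math.log(sum(math.exp(e - top) for e in exponents) * h)
    return -log_integral / beta
```

For instance, with `f = lambda p: (p - 0.3) ** 2 + 0.5` the estimate tends to the true minimum 0.5 as `beta` grows, with a correction that shrinks roughly like $\log(\beta)/\beta$.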

To proceed, we will make the mild (but important) assumption that, for large $N$, it matters little which specific strategy matrix the users actually picked. More formally:

Assumption 12 (Self-averaging): For any strategy matrix $c$:

$$\log Z(\beta, c) \sim \langle \log Z(\beta, c) \rangle_{\text{all } c} \tag{26}$$

almost surely as $N \to \infty$ (the averaging $\langle \cdot \rangle$ takes place over all $B^{NMS}$ matrices $c$, drawn according to the nodes' strength distribution $y$).

This is a fundamental assumption in statistical physics and describes the rarity of configurations which yield notable differences in macroscopically observable parameters. Under this light, we are left to calculate $\langle \log Z \rangle$, a problem which we will attack with the help of replica analysis.

The starting point of the method is the identity

$$\langle \log Z \rangle = \lim_{n \to 0^+} \frac{1}{n} \log \langle Z^n \rangle \tag{27}$$

which reduces the problem to powers of $Z$. These are much more manageable since, for $n \in \mathbb{N}$:

$$Z^n = \int_{\Delta^n} e^{-\beta N \sum_{\mu=1}^{n} R(p_\mu)} \prod_{\mu=1}^{n} dp_\mu \tag{28}$$



See also [21] (p. 529) for more on this point.
$Z$ depends on the strategy matrix $c$ through the frustration level $R(p)$.
Essentially, this refers to the fact that $\max_D f = \lim_{t \to \infty} \frac{1}{t} \log \int_D e^{t f(x)}\, dx$ for any measurable function $f$ on a compact domain $D$ (see e.g. [31]).
See [30] for a general discussion or [21], [32] for the minority game.
To prove this identity, write $Z^n = e^{n \log Z}$ and expand.

i.e. $Z^n = \prod_{\mu=1}^{n} Z_\mu$ where $Z_\mu = \int \exp(-\beta N R(p_\mu))\, dp_\mu$ is the partition function for the replica $p_\mu = (p_{is\mu})$ of the system. Then, thanks to equation (23), we obtain:

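$$\langle Z^n(\beta) \rangle = A^n \int_{\Delta^n} \left\langle e^{-\frac{\beta}{M(B-1)} \sum_{m,\mu} (c_\mu^m)^2} \right\rangle e^{\beta N \sum_\mu G_{\mu\mu}(p)} \prod_\mu dp_\mu \tag{29}$$

where $A = e^{-\beta N}$; $c_\mu^m = b(m, p_\mu) = \sum_i \sum_s p_{is\mu} c_{is}^m$ is the aggregate bet for the mixed profile $p_\mu = (p_{is\mu})$ in the $\mu$-th replica (given the signal $m$); and $G_{\mu\nu}(p) = \frac{1}{N} \sum_i \sum_s p_{is\mu} p_{is\nu}$.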

Of course, what we really need is to express $\langle Z^n \rangle$ for real values of $n$ in the vicinity of $n = 0^+$; for this, we resort to:

Assumption 13 (Replica Continuity): The expression given in (29) for $\langle Z^n \rangle$ can be continued analytically to all real values of $n$ in the vicinity of $n = 0^+$.

At first glance, this might appear as a blind leap of faith, 
especially since uniqueness criteria (e.g. log-convexity) are 
absent. However, such criteria can in some cases be established 
(see e.g. [33]) and, moreover, the huge amount of literature 
surrounding this assumption and the agreement of our own 
analysis with our numerical results (see figures 2-5) makes us
feel justified in employing it. 
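The replica identity (27) that this continuation serves can be illustrated on a finite sample, where the average over $c$ is an ordinary mean and non-integer $n$ poses no difficulty. The function names and the sample values used below are hypothetical; the sketch only shows that $\frac{1}{n} \log \langle Z^n \rangle$ approaches $\langle \log Z \rangle$ as $n \to 0^+$, whereas $n = 1$ gives the larger "annealed" value $\log \langle Z \rangle$.

```python
import math


def quenched_average(z_samples):
    """<log Z>: the average of the logarithm over the sample."""
    return sum(math.log(z) for z in z_samples) / len(z_samples)


def replica_estimate(z_samples, n):
    """(1/n) * log <Z^n> for a real, non-zero replica number n."""
    moment = sum(z ** n for z in z_samples) / len(z_samples)
    return math.log(moment) / n
```

The gap between the two quantities is of order $n$ times the variance of $\log Z$, so it closes linearly as $n$ shrinks.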

With the help of the above, and after the lengthy calculations of appendix B, we are in a position to prove:

Theorem 14 (Irrelevance of Node Strengths): Let $y, y' \in \mathrm{Int}(\Delta_B)$ be strength distributions for $B$ nodes and let $\mathfrak{G}$, $\mathfrak{G}'$ be simplex games for $y$ and $y'$ respectively. Then, as $N \to \infty$:

$$R(\mathfrak{G}) \sim R(\mathfrak{G}') \tag{30}$$

In other words, we are (rather unexpectedly!) reduced to 
the symmetric case of B equivalent nodes: ceteris paribus, the 
price of anarchy depends only on the number of nodes present 
and not on their individual strengths. 

[Figure 2 plot, titled "Independence on Node Strength": price of anarchy versus the training parameter $\lambda$; markers for the symmetric simplex and the arbitrary simplex, solid line for theory.]

Fig. 2. The price of anarchy (i.e. the steady-state frustration level) as a function of the training parameter $\lambda = M/N$ for $B = 4$ equivalent nodes contrasted to that of 4 nodes employing standards with different spectral efficiencies $c_r$: EVDO-Rev.A (1.06 Mbps), HSDPA (3.91 Mbps), 802.11b (11 Mbps) and WiMAX (14.1 Mbps) [34]; we simulated $N = 50$ users with $S = 2$ strategies and averaged over 25 realizations of the game. As predicted by theorem 14, different standards do not affect the price of anarchy.

Now, in order to actually determine the effect of choices on the users' frustration level, we first define the binary reduction of a simplex game $\mathfrak{G}$ for $B$ nodes. This is just a simplex game $\mathfrak{G}_{\text{eff}}$ for 2 identical nodes and a training set enlarged by a factor of $B - 1$, i.e. $M_{\text{eff}} = M(B - 1)$; everything else remains the same. Then, under this rescaling, the same train of calculations that is used to prove theorem 14 also yields:

Theorem 15 (Reduction of Choices): The price of anarchy for a simplex game $\mathfrak{G}$ is asymptotically equal to that of its binary reduction $\mathfrak{G}_{\text{eff}}$; in other words, as $N \to \infty$:

$$R(\mathfrak{G}) \sim R(\mathfrak{G}_{\text{eff}}) \tag{31}$$

Thanks to this equivalence, we see that the price of anarchy 
depends on M and B only through M{B - 1); so, for example, 
if some nodes go offline, we will know exactly how much to 
increase M so as to maintain the same performance level. 
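Since only the product $M(B - 1)$ matters, the compensation for a change in the number of nodes is a one-line computation. The helper below is a hypothetical illustration of that arithmetic, not a procedure taken from the paper.

```python
def rescaled_training_size(m_old, b_old, b_new):
    """Training-set size that keeps M * (B - 1) fixed when B changes.

    m_old: current size M of the training set
    b_old: current number of nodes B
    b_new: number of nodes after some go offline (or come online)
    """
    if b_old < 2 or b_new < 2:
        raise ValueError("the game needs at least 2 nodes")
    return m_old * (b_old - 1) / (b_new - 1)
```

For example, going from 5 nodes to 3 with $M = 30$ calls for a training set of size 60, since $30 \cdot 4 = 60 \cdot 2$.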

However, theorem 15 really tells us much more: it provides a "dictionary" between the simplex game and the extensively studied minority game. Indeed, mutatis mutandis, one sees that the price of anarchy $R(\mathfrak{G})$ corresponds to the market volatility $\sigma$ in the minority game [21]. So, if we follow the (replica-symmetric) calculations of [21], we finally obtain the price of anarchy in terms of the game's parameters $B$, $S$ and $\lambda = M/N$:

$$R(\mathfrak{G}) \sim \Theta(\lambda - \lambda_c) \left( 1 - \sqrt{\lambda_c / \lambda} \right)^2 \tag{32}$$

where $\Theta$ is the Heaviside step function and $\lambda_c = \lambda_c(S, B)$ is the critical value that marks the emergence of anarchy within the premises of replica symmetry (see appendix B).
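As a quick way to explore this expression, the sketch below evaluates the right-hand side of (32) as we read it, i.e. $\Theta(\lambda - \lambda_c)(1 - \sqrt{\lambda_c/\lambda})^2$. The critical value is passed in as a plain number rather than computed, and the function name is hypothetical.

```python
import math


def price_of_anarchy(lam, lam_c):
    """Right-hand side of (32) as read here: zero up to lam_c, then (1 - sqrt(lam_c / lam))^2.

    lam:   training parameter lambda = M / N (must be positive)
    lam_c: critical value lambda_c, supplied by the caller
    """
    if lam <= 0:
        raise ValueError("lambda must be positive")
    if lam <= lam_c:
        return 0.0
    return (1.0 - math.sqrt(lam_c / lam)) ** 2
```

In this form the price of anarchy vanishes below the critical value, increases monotonically above it and tends to the random-user baseline of 1 as $\lambda \to \infty$.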



[Figure 3 plot: price of anarchy versus the training parameter $\lambda$; simulation markers and theory curves for B = 2, B = 3 and B = 5.]

Fig. 3. The effect of choices: we plot the price of anarchy as a function of the training parameter $\lambda = M/N$ for $N = 50$ users, $S = 2$ strategies and $B = 2, 3, 5, 10$ nodes (averaging over 25 realizations). We see that efficiency deteriorates as $B$ increases: more choices actually confuse the users.

This expression is one of our key results since it accurately captures the impact of the various system parameters on the network's performance (see e.g. figures 2-5). So, even though it follows effortlessly by virtue of theorem 15, for the sake of completeness (and also to discuss the role of replica symmetry), we carry out the derivation of (32) in appendix B.

Actually $\lambda_c = \frac{\zeta^2(S)}{B-1}$ with $\zeta(S) = \frac{S}{2^{S-1}} \sqrt{\frac{2}{\pi}} \int_{-\infty}^{\infty} z e^{-z^2} \operatorname{erfc}^{S-1}(z)\, dz$.

Fig. 4. The effect of sophistication: we plot the price of anarchy as a function of the training parameter $\lambda = M/N$ for $N = 50$ users, $S = 2, 3, 4$ strategies and $B = 5$ nodes (again averaging over 25 realizations of the game). As expected, sophisticated users (larger $S$) are more efficient.



[Figure 5 plot: price of anarchy versus the training parameter $\lambda$; simulation markers for N = 10, N = 25, N = 50 and N = 100, solid line for theory.]

Fig. 5. The price of anarchy as a function of the training parameter $\lambda = M/N$ for different numbers of users $N = 10, 25, 50, 100$, with $B = 5$ nodes and $S = 2$ strategies; unlike other plots, we are harvesting the price of anarchy from a single realization of the game. We see that the number of players does not seriously impact the price of anarchy (except through $\lambda$).

VI. Conclusions 

Our main goal was to analyze an unregulated network of 
(a large number of) heterogeneous users that can connect to a 
multitude of wireless nodes with different specifications (e.g. 
different standards). In such a network, users who selfishly try 
to maximize their individual downlink throughput will have
to compete against each other for the nodes' finite resources. 
So, in the pursuit of order (and in the absence of a central 
overseer), we advocate the use of a training beacon (such as a 
random integer synchronously broadcasted by the nodes) to act 
as a coordination stimulus: by processing this stimulus with 
the aid of some preprogrammed strategies and choosing a node 
accordingly, users should be able to reach an equilibrium. 



Indeed, if users keep records of their strategies' performance 
and rank them based on the evolutionary scheme of exponential
learning, they learn to coordinate their actions
and quickly reach an evolutionarily stable state. This state is
also socially stable in the sense that unilateral deviation is (on 
average) discouraged: it is a correlated equilibrium. Then, to 
measure the efficiency of users in this setting, we examine how 
far they are from the optimal distribution that maximizes their 
aggregate throughput. In so doing, we see that exponential 
learning leads the users to their most efficient equilibrium. 

However, since the users' rationality is bounded (i.e. they 
can only handle a small number of strategies), this equilibrium 
will still be at some distance from the optimal state. This 
distance is the price of (correlated) anarchy and we calculate 
it with the method of replicas. Interestingly, we find (theorem
14) that the price of anarchy does not depend on the nodes'
characteristics, but only on their number. In fact, we provide a
reduction of our scenario to the minority game [19] (theorem
15) and, as a result, we obtain the analytic expression (32) for
the price of anarchy. This also generalizes the results obtained 
for the minority game to an arbitrary number of choices. 

Thanks to the above, we derive quantitative predictions 
about the degree of anarchy in our scenario. For example (fig.
3), we see that blindly adding more nodes to a network is not a
panacea: anarchy actually increases with the number of nodes
because the users are not able to process the added complexity
and do not make efficient use of the extra resources. On the
other hand, if users become more sophisticated and employ
more strategies (fig. 4), anarchy comes at a lesser price (albeit
at a slower convergence to a stable state). Finally, we see that
the number of users really doesn't have to be quite so large
(fig. 5): these conclusions hold even for the much smaller
numbers of users typically encountered in local service areas. 

Appendix A 
Properties of y-Simplices

We begin here by showing that definition 1 is not vacuous:

Lemma 16: There exists a $y$-simplex $\mathcal{Q} = \{q_r\}_{r=1}^{B} \subseteq \mathbb{R}^{B-1}$ for any $y \in \mathrm{Int}(\Delta_B)$.

Proof: Begin by selecting a vector $q_1 \in \mathbb{R}^B$ such that $q_1^2 = \frac{1}{y_1} - 1$ and choose $q_{r+1} \in \mathbb{R}^B$ inductively so that it satisfies (5) when multiplied by $q_1 \dots q_r$. Such a selection is always possible for $r \leq B - 1$ thanks to the dimension of $\mathbb{R}^B$; for a vector space of lesser dimension, this is no longer the case.

In this way, we obtain $B$ vectors $q_r \in \mathbb{R}^B$ that satisfy (5); our construction will be complete once we show that $\mathcal{Q}$ is contained in some $(B-1)$-subspace of $\mathbb{R}^B$. However, as in the proof of lemma 2, we can see that $\sum_{r=1}^{B} y_r q_r = 0$; this means that $\mathcal{Q}$ is linearly dependent and completes our proof. ■
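The existence claim can also be made concrete numerically. The sketch below does not follow the proof's route through $\mathbb{R}^B$: instead it builds the first $B - 1$ vectors directly in $\mathbb{R}^{B-1}$ by a Cholesky-type recursion on the Gram relations $q_r \cdot q_l = -1 + \delta_{rl}/y_r$ and then fixes the last vector through $\sum_r y_r q_r = 0$. The function names are hypothetical, and the strength vector `y` is assumed to be strictly positive and to sum to 1.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def y_simplex(y):
    """Return B vectors in R^(B-1) with q_r . q_l = -1 + delta_rl / y_r.

    y must have positive entries summing to 1 (an interior point of the simplex).
    """
    d = len(y) - 1

    def gram(r, l):
        return -1.0 + (1.0 / y[r] if r == l else 0.0)

    # Cholesky-type recursion for the first d vectors (rows of a lower-triangular matrix).
    rows = [[0.0] * d for _ in range(d)]
    for r in range(d):
        for l in range(r + 1):
            s = gram(r, l) - sum(rows[r][k] * rows[l][k] for k in range(l))
            rows[r][l] = math.sqrt(s) if r == l else s / rows[l][l]

    # The last vector is forced by the linear dependence sum_r y_r q_r = 0.
    q_last = [-sum(y[r] * rows[r][k] for r in range(d)) / y[d] for k in range(d)]
    return rows + [q_last]
```

With the vectors in hand, one can verify the Gram relations directly, as well as the identity $\sum_r y_r (q_r \cdot x)^2 = x^2$ for an arbitrary test vector $x$.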

The next lemma is a key property of y-simplices that plays a crucial role in the calculations of appendix B.

In figure 1 we see that convergence occurs within tens of iterations. Thus, if each iteration is of the order of milliseconds (a reasonable transmission timescale for wideband wireless networks), this corresponds to equilibration times of tens of milliseconds.

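Note that $y_r + y_l \leq 1$, so that $q_r^2 q_l^2 \geq (q_r \cdot q_l)^2$ holds for $r \neq l$.

Lemma 17: Let $\mathcal{Q} = \{q_r\}_{r=1}^{B} \subseteq \mathbb{R}^{B-1}$ be a $y$-simplex for some $y \in \mathrm{Int}(\Delta_B)$. Then, for all $x \in \mathbb{R}^{B-1}$:

$$\sum_{r=1}^{B} y_r (q_r \cdot x)^2 = x^2$$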


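Proof: Since $y \in \mathrm{Int}(\Delta_B)$, $\mathcal{Q}$ will span $\mathbb{R}^{B-1}$ and $x$ may be written as a linear combination $x = \sum_{r=1}^{B} x_r q_r$. So, if we let $S = \sum_{r=1}^{B} x_r$ and recall that $\sum_{r=1}^{B} y_r = 1$, we will have: $x^2 = \sum_{l,r=1}^{B} x_l x_r\, q_r \cdot q_l = -S^2 + \sum_{r=1}^{B} x_r^2 / y_r$. Similarly: $(q_r \cdot x)^2 = S^2 - 2S \frac{x_r}{y_r} + \frac{x_r^2}{y_r^2}$, and an addition over $r$ (weighted by $y_r$) yields the lemma. ■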

Appendix B 
Measuring the Price of Anarchy 

Picking up where we left off in section V, we begin by calculating the expression for $\langle Z^n \rangle$ in (29). To do this, we will use the identity: $e^{-x^2/2} = (2\pi)^{-k/2} \int e^{-z^2/2 + i z \cdot x}\, dz = \mathbb{E}_z\, e^{i z \cdot x}$, where $\mathbb{E}_z$ denotes expectation over a Gaussian random vector $z$ with $k$ independent components $z_1 \dots z_k \sim \mathcal{N}(0, 1)$; this is the Hubbard-Stratonovich transformation. So, if $z_\mu^m$ ($\mu = 1 \dots n$, $m = 1 \dots M$) are such vectors of $\mathbb{R}^{B-1}$, we get:

$$e^{-\frac{\beta}{M(B-1)} \sum_{m,\mu} (c_\mu^m)^2} = \mathbb{E}_z\, e^{i \sum_m \sum_i \sum_s x_{is}^m \cdot c_{is}^m} \tag{33}$$

where: $x_{is}^m = \sqrt{\frac{2\beta}{M(B-1)}} \sum_\mu p_{is\mu} z_\mu^m \in \mathbb{R}^{B-1}$. Then, by the independence of the $c_{is}^m$'s, we will be able to obtain the average $\langle \cdot \rangle$ of (33) over the matrices $c$ by computing the characteristic function $\langle e^{i x \cdot c} \rangle$ for only one of them. This is done in the following:

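Lemma 18: Let $y \in \mathrm{Int}(\Delta_B)$ and let $\mathcal{Q} = \{q_r\}_{r=1}^{B}$ be a $y$-simplex in $\mathbb{R}^{B-1}$. If $x \in \mathbb{R}^{B-1}$ and $q$ is a random vector with distribution $P(q = q_r) = y_r$, then: $\langle e^{i x \cdot q} \rangle = e^{-x^2/2} + O(|x|^3)$.

Proof: Expanding the exponential $\langle \exp(i x \cdot q) \rangle$ yields:

$$\begin{aligned} \langle e^{i x \cdot q} \rangle &= \left\langle 1 + i x \cdot q - \tfrac{1}{2} (x \cdot q)^2 + O(|x|^3) \right\rangle \\ &= 1 + i x \cdot \sum_r y_r q_r - \tfrac{1}{2} \sum_r y_r (q_r \cdot x)^2 + O(|x|^3) \\ &= 1 - \tfrac{1}{2} x^2 + O(|x|^3) = e^{-x^2/2} + O(|x|^3) \end{aligned} \tag{34}$$

where the third equality comes from lemmas 2 and 17. ■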
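The cubic error term of the lemma is easy to observe in the smallest case $B = 2$, where the $y$-simplex lives on the real line: the Gram relations give $q_1 = \sqrt{y_2/y_1}$ and $q_2 = -\sqrt{y_1/y_2}$. The sketch below (hypothetical function name, illustrative values of $y_1$ and $x$) computes the exact characteristic function and its distance from the Gaussian $e^{-x^2/2}$.

```python
import cmath
import math


def char_fn_error(x, y1):
    """|<exp(i x q)> - exp(-x^2 / 2)| for the two-node y-simplex on the real line.

    y1 is the strength of the first node (0 < y1 < 1); the second node has
    strength y2 = 1 - y1, and q takes the values q1, q2 with probabilities y1, y2.
    """
    y2 = 1.0 - y1
    q1 = math.sqrt(y2 / y1)    # q1^2 = 1/y1 - 1
    q2 = -math.sqrt(y1 / y2)   # q2^2 = 1/y2 - 1 and q1 * q2 = -1
    char_fn = y1 * cmath.exp(1j * x * q1) + y2 * cmath.exp(1j * x * q2)
    return abs(char_fn - math.exp(-0.5 * x * x))
```

For asymmetric strengths, halving $x$ reduces the error by a factor close to 8, as expected of an $O(|x|^3)$ remainder; in the symmetric case $y_1 = 1/2$ the cubic term cancels and the error is smaller still.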

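In our case, $|x_{is}^m| = O(M^{-1/2})$; so, if we apply the previous lemma to each of the random vectors $c_{is}^m$, the average of eq. (33) will become (to leading order in $N$):

$$\left\langle e^{i \sum_m \sum_i \sum_s x_{is}^m \cdot c_{is}^m} \right\rangle = e^{-\frac{1}{2} \sum_m \sum_i \sum_s (x_{is}^m)^2} = e^{-\frac{\beta}{\lambda(B-1)} \sum_m \sum_{\mu,\nu} G_{\mu\nu}(p)\, z_\mu^m \cdot z_\nu^m} \tag{35}$$

where $\lambda = M/N$ is the game's training parameter.

Now, if we introduce the $n \times n$ matrix $J = I + \frac{2\beta}{\lambda(B-1)} G(p)$, and recall that $\int e^{-\frac{1}{2} \sum_{\mu,\nu} J_{\mu\nu} w_\mu w_\nu} \prod_\mu d\tilde{w}_\mu = |\det(J)|^{-1/2}$, we may integrate over the auxiliary variables $z_\mu^m$ to obtain:

$$\mathbb{E}_z\, e^{-\frac{\beta}{\lambda(B-1)} \sum_m \sum_{\mu,\nu} G_{\mu\nu}(p)\, z_\mu^m \cdot z_\nu^m} = \prod_{m=1}^{M} \prod_{k=1}^{B-1} \int e^{-\frac{1}{2} \sum_{\mu,\nu} J_{\mu\nu}(p)\, w_{\mu k}^m w_{\nu k}^m} \prod_\mu d\tilde{w}_{\mu k}^m = e^{-\frac{M(B-1)}{2} \log\det(J(p))} \tag{36}$$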



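So, after these calculations, equation (29) finally becomes:

$$\langle Z^n \rangle = A^n \int_{\Delta^n} e^{-\frac{M(B-1)}{2} \log\det(J(p)) + \beta N \sum_\mu G_{\mu\mu}(p)} \prod_\mu dp_\mu \tag{37}$$

Here, tildes as in $d\tilde{w}$ denote Lebesgue measure normalized by $\sqrt{2\pi}$.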



Clearly, this last expression is independent of the strength distribution $y$, a fact which proves theorem 14. In addition, we observe that (37) remains invariant when we pass from the game $\mathfrak{G}$ to its binary reduction $\mathfrak{G}_{\text{eff}}$ with the rescaled training parameter $\lambda_{\text{eff}} = \lambda(B - 1)$, thus proving theorem 15 as well.

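Now, to proceed from (37), we will introduce $n^2$ $\delta$-functions in their integral representation so as to isolate the profiles $p_{is\mu}$: $\delta(Q_{\mu\nu} - G_{\mu\nu}(p)) = \frac{N\beta}{2\pi} \int e^{i N \beta\, k_{\mu\nu} (Q_{\mu\nu} - G_{\mu\nu}(p))}\, dk_{\mu\nu}$. In this way, the integral of (37) becomes (up to a multiplicative constant):

$$\int e^{-N\beta \left[ \frac{\lambda(B-1)}{2\beta} \log\det\left( I + \frac{2\beta}{\lambda(B-1)} Q \right) - \operatorname{tr}(Q) - i \sum_{\mu,\nu} k_{\mu\nu} (Q_{\mu\nu} - G_{\mu\nu}(p)) \right]}\, d\sigma \tag{38}$$

where $d\sigma = \prod_\mu dp_\mu \times \prod_{\mu,\nu} dk_{\mu\nu} \times \prod_{\mu,\nu} dQ_{\mu\nu}$ is the product measure on $\Delta^n \times \mathbb{R}^{n^2} \times \mathbb{R}^{n^2}$. However, $p$ only appears in the last term of (38) and can be integrated separately to yield: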



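$$\int_{\Delta^n} e^{-i N \beta \sum_{\mu,\nu} k_{\mu\nu} G_{\mu\nu}(p)} \prod_\mu dp_\mu = \prod_{i=1}^{N} \int_{\Delta_S^n} e^{-i\beta \sum_{\mu,\nu} k_{\mu\nu} \sum_s p_{is\mu} p_{is\nu}} \prod_{s,\mu} dp_{is\mu} = \exp\left[ N \log \int_{\Delta_S^n} e^{-i\beta \sum_{\mu,\nu} k_{\mu\nu} \sum_s p_{s\mu} p_{s\nu}} \prod_{s,\mu} dp_{s\mu} \right] \tag{39}$$

(recall that $G_{\mu\nu}(p) = \frac{1}{N} \sum_i \sum_s p_{is\mu} p_{is\nu}$ and $\Delta^n = (\Delta_S^n)^N$). So, by descending to the limit $N \to \infty$, we find:

$$\frac{1}{N} \log \langle Z^n \rangle \sim -\beta \left[ n + \frac{\lambda(B-1)}{2\beta} \log\det\left( I + \frac{2\beta}{\lambda(B-1)} Q \right) - \operatorname{tr}(Q) - i \sum_{\mu,\nu} k_{\mu\nu} Q_{\mu\nu} - \frac{1}{\beta} \log \int_{\Delta_S^n} e^{-i\beta \sum_{\mu,\nu} k_{\mu\nu} \sum_s p_{s\mu} p_{s\nu}} \prod_{s,\mu} dp_{s\mu} \right] \equiv -\beta \Lambda \tag{40}$$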



where $Q$ and $k$ extremize the function $\Lambda$ within the brackets. This is where we will invoke replica symmetry (see [30], [32]).

Assumption 19 (Replica Symmetry): The saddle-points of $\Lambda$ are of the form:

Qt,v^q + (Q-q)5^,y; k^y^iA/3^[r + (R-r)d^y) (41) 

In other words, we seek saddle-point matrices that are symmetric in the replica space (the scaling factors are there for future convenience). Under this ansatz, we obtain:

A = „ + ii|^logdet(f/,+(l + f|^)v)-«e 
+ nA/3^(QR - qr) + n^A/3^qr 

-i log r e-'/'' ¥((«-'•) Z.P^+KZ. p.)') Y\ dp,, (42) 

Ja" 

where $p_\mu$ is the generic profile $(p_{1\mu} \dots p_{S\mu})$ in the $\mu$-th replica.

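The second term of the above expression can be easily calculated by noting that $\det(q + p\,\delta_{\mu\nu}) = p^n \left( 1 + n \frac{q}{p} \right)$: it will be equal to $\frac{nq}{1+\chi} + \frac{n\lambda}{2\beta} (B-1) \log(1+\chi) + o(n)$, where $\chi = \frac{2\beta (Q-q)}{\lambda(B-1)}$.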
As for the last term of ( l42l i. we will again use the Hubbard- 
Stratonovich transformation with a canonical Gaussian vari- 
able z of M.^ to write: e''*/^'¥(Z,p.)' = E^ e-/^V'-^(B-i)z-Z„p,. 
Then, for notational convenience, we also let: 

y(z,p) = V^^ICS^T)z-p-/l/3^(7?-r)p- (43) 

This assumption can actually be dropped; e.g. see [32] where the first step of symmetry breaking (1RSB) is performed. Still, replica symmetry does not incur a significant error on our calculations while greatly simplifying them.

and, in this way, the integral of (42) becomes ($dp = \prod_s dp_s$):
log r e-^2„ v{z,p,) j-j ^p^^ ^ „ r g-/iv(z,p) + 

J a; s,fi J As 
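The determinant identity used for the replica-symmetric matrices, $\det(q + p\,\delta_{\mu\nu}) = p^n (1 + nq/p)$, can be confirmed for small integer $n$. The sketch below uses a plain Gaussian elimination so that it needs nothing beyond the standard library; the function names and the sample values of $q$ and $p$ are hypothetical.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-14:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def replica_symmetric_matrix(n, q, p):
    """n x n matrix with entries q + p * delta_{mu nu}."""
    return [[q + (p if mu == nu else 0.0) for nu in range(n)] for mu in range(n)]
```

The closed form is what allows the expression to be continued to real $n$ and expanded around $n = 0$, which an explicit $n \times n$ determinant cannot do.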

From (27) and the premises of replica continuity (assumption 13), what we really need to calculate is $\Lambda_0 = \lim_{n \to 0} \frac{1}{n} \Lambda$:



^0 = l + TT7 + ^(-B-l)log(l+;r)-G 



+ A/3^(QR - qr) 



log f e-/^^(^ P> dp 

Jas 



(44) 



where $Q, q, R, r$ have been chosen so as to satisfy the replica-symmetric saddle-point equations: $\frac{\partial \Lambda_0}{\partial Q} = 0$, $\frac{\partial \Lambda_0}{\partial q} = 0$, etc.

To that end, it can be shown that both $Q - q$ and $R - r$ are of order $O(1/\beta)$, i.e. $\chi$ remains finite as $\beta \to \infty$. So, in this limit, we will once again perform asymptotic integration for the integrals that appear in the saddle-point equations. Thus, we are led to consider the vertex $p^*(z)$ of $\Delta_S$ which minimizes the harmonic function $V(z, \cdot)$ and we obtain:



Q 



R^r + 



P X{B-\) 1+^ 
4 1 



(45) 



where $\phi = \mathbb{E}_z[p^*(z)^2]$ and $\zeta = \mathbb{E}_z[p^*(z) \cdot z]$.

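Now, if we let $\beta \to \infty$ and substitute (45) in (44), we get $\Lambda_0 = 1 - \phi + \phi \left( 1 + \frac{\zeta}{\sqrt{\phi \lambda (B-1)}} \right)^2$ where, after a little geometry: $\zeta(S) = \mathbb{E}_z[\min\{z_1 \dots z_S\}] = \frac{S}{2^{S-1}} \sqrt{\frac{2}{\pi}} \int_{-\infty}^{\infty} z e^{-z^2} \operatorname{erfc}^{S-1}(z)\, dz$ and $\phi = 1$. Hence, for finite $\chi$ (i.e. for $\lambda > \lambda_c = \frac{\zeta^2}{B-1}$), we finally acquire expression (32) for the game's price of anarchy:

$$R(\mathfrak{G}) \sim \Lambda_0 \sim \Theta(\lambda - \lambda_c) \left( 1 - \sqrt{\lambda_c / \lambda} \right)^2.$$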
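The quantity $\zeta(S) = \mathbb{E}_z[\min\{z_1 \dots z_S\}]$ is the mean of the smallest of $S$ independent standard Gaussians, so it can be evaluated by one-dimensional quadrature. The sketch below integrates $z$ against the density of the minimum, $S\,\varphi(z)\,(1 - \Phi(z))^{S-1}$, with a midpoint rule. The relation $\lambda_c = \zeta^2(S)/(B-1)$ used in the second function is an assumption based on our reading of the garbled source, and both function names are hypothetical.

```python
import math


def zeta(n_strategies, half_width=10.0, steps=20000):
    """Mean of the minimum of n_strategies i.i.d. standard Gaussians (midpoint rule)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        z = -half_width + (k + 0.5) * h
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        survival = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(Z > z) = 1 - Phi(z)
        total += z * n_strategies * density * survival ** (n_strategies - 1)
    return total * h


def critical_lambda(n_strategies, n_nodes):
    """Assumed critical training parameter zeta(S)^2 / (B - 1)."""
    return zeta(n_strategies) ** 2 / (n_nodes - 1)
```

For $S = 2$ the quadrature reproduces the closed form $-1/\sqrt{\pi}$, and $|\zeta(S)|$ grows with $S$, which is consistent with more sophisticated users being more efficient.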